Swoosh: a generic approach to entity resolution

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

P-Swoosh: Parallel Algorithm for Generic Entity Resolution

Entity Resolution (ER) is a problem that arises in many information integration applications. ER process identifies duplicated records that refer to the same real-world entity (match process), and derives composite information about the entity (merge process). Additionally, the merged record can match another records recursively. Since the ER process is typically compute-intensive, it is import...

متن کامل

Developments in Generic Entity Resolution

Entity resolution (ER) is the problem of identifying which records in a database refer to the same entity. Although ER is a well-known problem, the rapid increase of data has made ER a challenging problem in many application areas ranging from resolving shopping items to counter-terrorism. The SERF project at Stanford focuses on providing scalable and accurate ER techniques that can be used acr...

متن کامل

A generic Web-based entity resolution framework

Web data repositories usually contain references to thousands of real-world entities from multiple sources. It is not uncommon that multiple entities share the same label (polysemes) and that distinct label variations are associated with the same entity (synonyms), which frequently leads to ambiguous interpretations. Further, spelling variants, acronyms, abbreviated forms, and misspellings comp...

متن کامل

A Machine Learning approach to Generic Entity Resolution in support of Cyber Situation Awareness

This paper introduces the Generic Entity Resolution (GER) framework; a framework that classifies pairs of entities as matching or non-matching based on the entities’ features and their semantic relationships with other entities. The GER framework has been developed as part of an AI-based system for the development of Cyber situational awareness and provides a data fusion role by resolving entit...

متن کامل

Generic Entity Resolution with Data Confidences

We consider the Entity Resolution (ER) problem (also known as deduplication, or merge-purge), in which records determined to represent the same real-world entity are successively located and merged. Our approach to the ER problem is generic, in the sense that the functions for comparing and merging records are viewed as black-boxes. In this context, managing numerical confidences along with the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: The VLDB Journal

سال: 2008

ISSN: 1066-8888,0949-877X

DOI: 10.1007/s00778-008-0098-x